ABSTRACT

Over  the  last  five  to  seven  years  the  use  of  chat  in  military  contexts  has  expanded  quite  significantly,  in  some  cases 
becoming  a  primary  means  of  communicating  time-sensitive  data  to  decision  makers  and  operators.  For  example,  during 
humanitarian operations with Joint Task Force-Katrina, chat was used extensively to plan, task, and coordinate
pre-deployment and ongoing operations. The informal nature of chat communications allows the relay of far more
information  than  the  technical  content  of  messages.  Unlike  formal  documents  such  as  newspapers,  chat  is  often  emotive. 
"Reading  between  the  lines"  to  understand  the  connotative  meaning  of  communication  exchanges  is  now  feasible,  and 
often  important.  Understanding  the  connotative  meaning  of  text  is  necessary  to  enable  more  useful  automatic 
intelligence  exploitation.  The  research  project  described  in  this  paper  was  directed  at  recognizing  user  connotations  of 
uncertainty  and  urgency.  The  project  built  a  matrix  of  speech  features  indicative  of  these  categories  of  meaning, 
developed  data  mining  software  to  recognize  them,  and  evaluated  the  results. 

Keywords:  connotative  meaning,  chat  communications,  text  processing,  intelligence  analysis 


1.  INTRODUCTION 

Chat  has  become  an  important  command  and  control  medium,  not  to  replace  existing  formal  communications,  but  to 
enhance  them  by  allowing  timelier,  more  accurate,  and  more  reliable  planning,  directing,  and  controlling  of  forces 
pursuant to the mission assigned [14]. Eovito’s chat use assessment states that warfighters choose to use chat because it is
fast, convenient, dependable, and efficient [2]. Chat messages can be quickly disseminated to everyone involved in
preparing  an  operation,  allowing  them  to  begin  their  preparation  without  delay.  Furthermore,  collaboration  among  chat 
users  does  not  require  looking  up  electronic  mail  addresses,  telephone  numbers,  or  radio  network  identifications. 
Military  chat  users  surveyed  felt  that  without  the  use  of  chat,  their  situation  awareness  would  be  diminished,  and 
information dissemination and coordination would be more difficult [13]. In 2003, the United States Navy conducted a
survey of chat usage by those on deployment for Operation Iraqi Freedom; the majority of the one hundred eighty-three
respondents indicated they used chat for over seven hours per day, six to seven days per week [1]. The increase in military
chat  use  has  made  automatic  processing  of  chat  text  necessary  to  provide  for  automated  data  collection,  collation,  and 
usage  in  new  capabilities  such  as  tactical  updates,  post-mission  operational  analysis,  and  watch  turnover. 

Unlike  formal  documents,  such  as  newspapers,  chat  is  often  emotive,  which  allows  the  relay  of  far  more  information 
than  just  the  technical  content  of  messages.  "Reading  between  the  lines"  to  understand  the  connotative  meaning  of 
communication  exchanges  is  now  feasible  and  may  become  important  for  sounding  alerts,  for  understanding  behavior 
for  after-action  reviews,  for  participant  identification  verification,  and  for  data  collection  and  analysis. 

The remainder of this paper is organized as follows. Section 2 discusses the background for this research. Section 3
describes  the  techniques  used  to  recognize  connotations  of  uncertainty  and  urgency  expressed  in  chat  messages,  and  the 
results  of  these  techniques  are  discussed  in  Section  4. 

2.  BACKGROUND 

Most  text  analysis  research  to  date  has  been  on  grammatical,  well-formed  text,  such  as  articles  from  the  Wall  Street 
Journal.  Analysis  of  chat  text  offers  new  challenges  due  to  its  dynamic  nature.  Chat  messages  often  include  misspellings, 
extra  or  missing  capitalization,  improper  grammar  constructs,  non-standard  punctuation,  abbreviations,  interwoven 
conversations,  and  other  unique  characteristics.  Some  of  the  processing  methodologies  for  linguistic  analysis  of 
grammatical  text  are  being  expanded  to  account  for  the  special  characteristics  of  text  chat  data  (three  of  many  examples: 
[Srihari  and  Schwartzmyer;  2007],  [Berube,  et  al;  2007]  and  [Carpenter;  2008]).  A  number  of  other  research  studies  are 
attempting  to  detect  less  concrete  aspects  of  chat  communications.  Some  of  them  have  focused  on  detecting  general 
emotion  cues  ([Glazer;  2002],  [Hancock,  et  al;  2007]).  Other  topics  of  chat  study  include  the  detection  of  empathy  ([Pfeil 
and  Zaphiris;  2007]),  the  detection  of  verbal  irony  ([Hancock,  2004]),  and  the  detection  of  certainty  (or  confidence)  and 
the  measurement  of  the  polarity  of  chat-detected  sentiments,  for  example,  negative/positive  and  favorable/unfavorable 
([Liddy;  2004]). 


3.  APPROACH 

The  objectives  of  this  research  were  to:  (1)  conduct  a  study  of  how  humans  recognize  connotative  cues  expressing 
uncertainty  and  urgency,  (2)  formulate  linguistic  and  non-linguistic  means  for  recognizing  those  cues,  (3)  develop 
prototype  algorithms  to  automatically  perform  recognition,  and  (4)  evaluate  the  prototype  recognition  algorithms.  A 
combination of off-the-shelf tools and novel approaches was used in algorithm development, and standard
information  extraction  metrics  were  used  for  performance  evaluation.  Four  hundred  fifty-nine  lines  of  chat  data  from 
military  exercises  were  used  for  this  research. 

3.1  Rule-based  analysis:  identifying  and  detecting  cues 

A  manual  review  of  the  data  was  performed  to  attempt  to  understand  what  cues  were  indicative  of  uncertainty  and 
urgency.  The  data  was  reviewed  for  linguistic  and  non-linguistic  cues.  Linguistic  cues  were  those  such  as  the  use  of 
particular  words  and  phrases.  Non-linguistic  cues  under  investigation  included  terse/lengthy  responses  (presuming  that 
lengthy  responses  are  very  rarely  used  under  circumstances  of  urgency  or  uncertainty),  the  use  of  capitalization, 
punctuation  (including  ellipsis),  abbreviations,  irregular  spelling,  and  metadata  values. 

Often,  one  sign  of  emotion  in  general  chat  communications  is  the  use  of  capitalized  words  as  a  means  of  indicating  high 
emotions or angry screaming. We found that capitalization in our military chat data is used for catching
attention or alerting other chat participants to important information; it is rarely, if ever, used for “screaming.”

Punctuation has been referred to as the ‘prosody of online communication’ [7], providing the equivalent of speech
intonation  in  text  to  relay  connotative  meaning.  In  many  ways  chat  communications  are  similar  to  transcribed  spoken 
dialogue.  For  instance,  they  often  contain  interjections,  such  as  “ah!”  and  “drat!”  However,  in  a  distinction  from  general 
chat,  the  military  chat  interjections  we  observed  rarely  included  the  identifying  punctuation. 

Abbreviations  that  are  common  in  general  chat  communications,  such  as  “msg”  (“message”)  and  “thx”  (“thanks”),  were 
present  in  our  dataset  along  with  an  additional  set  of  chat  abbreviations  that  are  specific  to  military  communications  (for 
instance,  “w/u”  to  mean  “wheels  up”).  [ALSA;  2009],  a  developing  document  to  facilitate  coordination  of  military  chat 
use,  recommends  avoiding  “civilian  convenience”  abbreviations  and  includes  a  table  of  standardized  chat  terminology. 
Some  abbreviations  are  easily  recognizable  and  commonly  used  in  civilian  chat,  for  example  “arr”  (arrived),  “neg” 
(negative), and “unk” (unknown); other abbreviations are unique to the military domain. The data used for this project
was  not  limited  by  the  restrictions  suggested  in  [ALSA;  2009]. 

Irregular spelling could be accidental misspelling, potentially due to rushed typing, or a purposeful expression (e.g.,
“riiiight” as an indication of the mental dawning of agreement, as opposed to “right” as an indication of simple,
immediate agreement). As in transcribed speech, ellipsis, the trail of dots that indicates an incomplete thought or an
omission of words (e.g., “Well, if that’s so...”), is very common in both general and military chat communications.

Metadata  values,  such  as  the  identification  of  the  chat  participant  and  the  associated  temporal  label  at  the  beginning  of 
each  message,  are  distinctive  characteristics  of  chat  communication  that  are  not  available  in  formal  texts  and  can  offer 
valuable  information  for  processing  systems.  For  example,  for  our  purposes  in  this  project,  knowledge  of  the  functional 
role  and  status  of  particular  speakers  could  have  been  important  input  to  the  determination  of  connotative  intent. 
However,  that  information  was  not  available  to  us.  The  temporal  component  of  the  metadata  was  determined  to  be  of  no 
use  in  recognizing  either  urgency  or  uncertainty  in  this  project.  Exchanges  were  sometimes  made  across  more  than  one 
room  (question  is  made  in  one  room,  answer  is  given  in  another),  communication  seemed  lax  with  lengthy  response 
times  (possibly  due  to  the  fact  that  it  is  data  from  an  exercise),  and  dialogue  sequences  were  difficult  to  untangle. 

3.1.1  Cues  for  uncertainty  and  urgency 

We  found  uncertainty  and  urgency  cues  to  be  quite  subtle  in  the  data.  In  our  definition  of  uncertainty,  we  were  looking 
for  messages  expressing  more  than  a  simple  need  for  information.  For  example,  the  single  message  “What  time  are  we 
striking?”  with  no  other  questions  near  it  would  be  considered  a  simple  request  for  information.  However,  when  there 
are  multiple  questions  in  one  message  or  across  consecutive  messages,  the  person(s)  involved  is  (are)  more  likely  to  be 
demonstrating  a  state  of  confusion  (that  is,  uncertainty).  Urgency,  from  our  manual  review  of  the  data,  seemed  to  be 
fairly cut and dried, and dependent on keywords. Messages that ended with “ASAP”, “immediately”, or “press” were very
likely  to  be  expressing  urgency.  Messages  ending  with  “now”  were  a  little  more  difficult  to  categorize,  as  the  message 
could  be  “Get  this  done  now”,  or  it  could  be  “I’m  working  this  now”.  The  first  may  be  expressing  urgency;  the  second  is 
more  of  a  status  update.  Other  than  keywords,  we  did  not  recognize  any  syntax  that  seemed  to  express  urgency.  As 
noted  earlier,  capitalization  did  not  provide  significant  evidence  of  urgency  in  our  data,  as  it  is  used  largely  just  to  catch 
the  attention  of  the  intended  recipients.  The  use  of  capitalized  “NO”  to  indicate  urgency  was  a  rare  exception. 
Exclamation  points  were  rarely  used,  and  usually  did  not  convey  urgency. 

The  cues  we  found  for  both  uncertainty  and  urgency  appeared  to  give  varying  levels  of  confidence,  so  confidence  scores 
of  1  to  5  (5  indicating  the  highest  confidence)  were  attached  to  each  cue.  Within  a  message,  it  is  possible  for  multiple 
cues  for  one  connotative  meaning  to  be  present;  in  these  cases,  the  scores  of  each  cue  are  added  to  determine  the 
confidence  level  of  the  indicated  connoted  meaning.  Table  1  lists  the  cues  and  scores  that  were  developed  through  the 
manual  review,  an  explanation  for  each  cue,  and  the  connoted  meaning.  Note  that  examples  given  are  very  simple  and 
intended  only  for  illustration  of  the  syntax  being  described. 

3.1.2  Rule-based  analysis:  detecting  cues 

It  was  determined  that  most  of  the  cues  recognized  during  the  manual  review  could  be  captured  by  regular  expressions 
(recognizable  patterns  that  can  be  interpreted  into  software  code).  A  prototype  Java  software  program  was  developed  to 
perform  the  recognition  of  rule-based  cues.  One  of  the  cues  (#7:  “Which  [noun]?”)  would  require  recognition  of  the 
classification  of  a  word  as  a  ‘noun’  by  a  software  parser  and,  although  the  rule  is  probably  pertinent,  it  would  not  have 
been  applied  many  times  in  relation  to  the  time  and  effort  it  would  have  taken  to  implement  it.  Therefore,  rule  #7  was 
not  implemented  in  the  software. 
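
As an illustration of how such cues can be expressed as regular expressions, the following minimal sketch scores a single message against a subset of the Table 1 cues and sums the points per connoted meaning. It is written in Python rather than the Java of the prototype, and the patterns, the CUES list, and the score_message function are assumptions made for this example, not the patterns used in the project. Cues 3 and 4 need neighboring messages and cue 7 needs a parser, so they are left out.

```python
import re

# (cue number in Table 1, connoted meaning, points, pattern).
# The patterns are illustrative guesses, not the project's own expressions.
CUES = [
    # Cue 1: two question marks with some text between them.
    (1, "uncertainty", 5, re.compile(r"\?[^?]*[^?\s][^?]*\?")),
    # Cue 5: multiple question marks in a row.
    (5, "uncertainty", 3, re.compile(r"\?{2,}")),
    # Cue 6: "understand" and a question mark anywhere in the message.
    (6, "uncertainty", 3, re.compile(r"(?=.*\bunderstand\b)(?=.*\?)", re.I | re.S)),
    # Cue 8: a question mark and an ellipsis in one message.
    (8, "uncertainty", 2, re.compile(r"(?=.*\?)(?=.*\.\.\.)", re.S)),
    # Cue 9: an ellipsis.
    (9, "uncertainty", 1, re.compile(r"\.\.\.")),
    # Cue 10: "ASAP", "immediately", or "press" at the end of a sentence.
    (10, "urgency", 4, re.compile(r"\b(?:asap|immediately|press)\s*(?:[.!?]+|$)", re.I)),
    # Cue 11: "now" at the end of a sentence.
    (11, "urgency", 3, re.compile(r"\bnow\s*(?:[.!?]+|$)", re.I)),
    # Cue 12: capitalized NO (case-sensitive on purpose).
    (12, "urgency", 3, re.compile(r"\bNO\b")),
    # Cue 13: "hot" somewhere in the message.
    (13, "urgency", 2, re.compile(r"\bhot\b", re.I)),
]


def score_message(text):
    """Return the summed cue points per connoted meaning for one message."""
    totals = {"uncertainty": 0, "urgency": 0}
    for _number, meaning, points, pattern in CUES:
        if pattern.search(text):
            totals[meaning] += points
    return totals
```

With these patterns, score_message("Need the coords ASAP") collects 4 urgency points. The status update "I'm working this now" also collects 3 urgency points, which reproduces the ambiguity of the “now” cue discussed in Section 3.1.1.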

Table 1: Cues developed by manual review of data

| Cue | Description | Explanation | Connoted meaning | Points (1-5) |
|---|---|---|---|---|
| 1 | Two or more questions in one message. | One speaker, one message, with two or more questions. More questions within one message indicate more uncertainty. | uncertainty | 5 |
| 2 | Questions with an option. | A question that gives a choice. Example: “Should target A be our priority, or is target B more important?” | uncertainty | 4 |
| 3 | One speaker with two or more questions in consecutive messages. | Example: Person A: “Are we striking at 1400?” Person B: “affirmative, strike at 1400.” Person A: “copy, what are the coords for the strike?” Person B: “56N 138W” | uncertainty | 4 |
| 4 | Two or more consecutive questions across speakers. | In consecutive messages, regardless of speaker, each message has at least one question. Example: Person A: “Are we striking at 1400?” Person B: “Is the location still 56N 138W?” | uncertainty | 4 |
| 5 | Multiple question marks at the end of a question. | More question marks usually mean more uncertainty. | uncertainty | 3 |
| 6 | “understand” and a question mark in a message. | Example: “I don’t understand. Weren’t we targeting A?” | uncertainty | 3 |
| 7 | “Which [noun]?” | Self-explanatory. | uncertainty | 3 |
| 8 | Question and ellipsis in one message. | Examples: “What time are we striking? I lost the info...” and “Do you know who we are looking for...?” | uncertainty | 2 |
| 9 | Ellipsis | Sentence within a message ends with an ellipsis (“...”). | uncertainty | 1 |
| 10 | “ASAP,” “immediately,” or “press” at the end of a sentence. | Self-explanatory. | urgency | 4 |
| 11 | “now” at the end of a sentence. | Self-explanatory. | urgency | 3 |
| 12 | Capitalized NO. | Example: “NO impact” | urgency | 3 |
| 13 | “hot” somewhere in the message. | Example: “Going hot with target A” | urgency | 2 |
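
Cues 3 and 4 of Table 1 cannot be decided from one message alone, because they depend on the speaker metadata and on neighboring messages. The sketch below shows one possible reading of those two cues. It assumes a single chat room whose messages are already in order, treats any message containing a question mark as a question, and interprets cue 3 as "the same speaker's previous message was also a question", following the Table 1 example. The function name and data layout are hypothetical.

```python
def cross_message_cues(messages):
    """Flag cues 3 and 4 of Table 1 for a list of (speaker, text) pairs.

    Returns one set of cue numbers per message, in the same order.
    """
    hits = [set() for _ in messages]
    previous_by_speaker = {}  # speaker -> index of that speaker's last message
    for i, (speaker, text) in enumerate(messages):
        asks = "?" in text
        j = previous_by_speaker.get(speaker)
        # Cue 3: this speaker's previous message was also a question.
        if asks and j is not None and "?" in messages[j][1]:
            hits[i].add(3)
            hits[j].add(3)
        # Cue 4: the immediately preceding message, by anyone, was a question.
        if asks and i > 0 and "?" in messages[i - 1][1]:
            hits[i].add(4)
            hits[i - 1].add(4)
        previous_by_speaker[speaker] = i
    return hits


# The two dialogue examples given in Table 1.
cue3_example = [
    ("A", "Are we striking at 1400?"),
    ("B", "affirmative, strike at 1400."),
    ("A", "copy, what are the coords for the strike?"),
    ("B", "56N 138W"),
]
cue4_example = [
    ("A", "Are we striking at 1400?"),
    ("B", "Is the location still 56N 138W?"),
]
```

Applied to the first example, both of Person A's messages are flagged with cue 3. In the second example, both messages are flagged with cue 4. Exchanges that span several rooms, which occurred in our data, would need to be untangled before a rule of this kind could be applied.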

3.2  Statistical  analysis:  maximum  entropy 

Maximum entropy (MaxEnt) is a statistical modeling technique in which a dataset from a seemingly random process is
used to make predictions about future data output. For this project, the OpenNLP group's Maximum Entropy package [11],
open source code written in Java, was trained on a subset of our data in which each message was tagged with our judgment
of its connotative content. Training data was derived from chat examples other than the testing
dataset,  from  the  same  data  source  and  event.  Approximately  twenty  samples  representing  each  of  uncertainty,  urgency, 
and  “other”  were  used  for  training.  The  program  used  this  training  data  to  develop  a  set  of  features  containing 
information  about  chat  statements  that  contain,  according  to  our  training  data,  urgency,  uncertainty,  and  neither  (other). 
When  the  trained  system  was  then  applied  to  the  test  data,  it  automatically  classified  chat  statements  as  containing  cues 
of  urgency,  of  uncertainty,  or  other. 
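
To make the classification step concrete, the following is a minimal sketch of a maximum entropy classifier over bag-of-words features. It is not the OpenNLP package: it is a small Python stand-in that fits the same kind of model (a softmax over summed feature weights) by plain stochastic gradient ascent, and the function names, learning rate, and number of epochs are assumptions. The six training messages are invented for illustration and are not taken from the exercise data.

```python
import math
import re


def features(text):
    """Bag-of-words features: the set of lowercased tokens (order is ignored)."""
    return set(re.findall(r"[a-z0-9']+|[?!]+", text.lower()))


def predict_proba(weights, labels, feats):
    """Softmax over the summed feature weights of each label."""
    scores = {y: sum(weights.get((y, f), 0.0) for f in feats) for y in labels}
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}


def train(samples, epochs=200, rate=0.1):
    """Fit the weights by stochastic gradient ascent on the log-likelihood."""
    labels = sorted({label for _, label in samples})
    data = [(features(text), label) for text, label in samples]
    weights = {}
    for _ in range(epochs):
        for feats, gold in data:
            probs = predict_proba(weights, labels, feats)
            for y in labels:
                gradient = (1.0 if y == gold else 0.0) - probs[y]
                for f in feats:
                    weights[(y, f)] = weights.get((y, f), 0.0) + rate * gradient
    return weights, labels


def classify(model, text):
    """Return the most probable label and its probability (confidence score)."""
    weights, labels = model
    probs = predict_proba(weights, labels, features(text))
    best = max(labels, key=lambda y: probs[y])
    return best, probs[best]


# Invented toy messages for illustration only.
TRAIN = [
    ("Need coords ASAP", "urgency"),
    ("Launch immediately", "urgency"),
    ("What time are we striking? Which target?", "uncertainty"),
    ("I don't understand, is it target A?", "uncertainty"),
    ("copy all", "other"),
    ("wheels up at 1400", "other"),
]
MODEL = train(TRAIN)
```

A call such as classify(MODEL, "Send the report ASAP") returns a label together with a probability between 0 and 1, which plays the role of the MaxEnt confidence score used in Section 3.3. Because the features are an unordered set of tokens from one message, a model of this kind cannot see word order or cues spread across messages.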

3.3 Combined rule-based and statistical analysis


In  an  attempt  to  make  the  best  use  of  each  of  the  methodologies,  we  applied  a  combination  of  our  rule-based  analysis 
and  the  maximum  entropy  statistical  analysis.  Software  code  was  written  to  combine  MaxEnt  and  our  cues  for  a  parallel 
analysis.  For  each  message  within  each  of  three  datasets,  the  decisions  of  MaxEnt  and  our  cue  table  are  considered 
together  and  final  results  are  produced  as  shown  in  Table  2.  Cue  table  confidence  scores  were  divided  by  five  to  force  a 
basis  for  comparison  with  MaxEnt  confidence  scores.  If  the  decisions  of  MaxEnt  and  the  cue  table  are  the  same,  then  the 
final  decision  of  the  combined  algorithm  is  that  same  decision.  If  MaxEnt  indicates  urgency  and  the  cue  table  indicates 
uncertainty,  then  the  final  decision  will  be  determined  by  the  highest  confidence  score  between  them.  Finally,  if  MaxEnt 
indicates  urgency  or  uncertainty  and  the  cue  table  indicates  other,  the  final  decision  is  based  on  the  MaxEnt  confidence 
score.  If  the  MaxEnt  confidence  score  is  greater  than  .6,  then  the  final  decision  will  match  the  MaxEnt  decision  for  that 
message,  otherwise  the  final  decision  is  other. 


Table 2: Parallel analysis with MaxEnt and cue table

| MaxEnt (rows) / Cue Table (columns) | Urgency | Uncertainty | Other |
|---|---|---|---|
| Urgency | Urgency | Highest Confidence Scorer | If MaxEnt Confidence Score > .6, Urgency; Otherwise, Other |
| Uncertainty | Highest Confidence Scorer | Uncertainty | If MaxEnt Confidence Score > .6, Uncertainty; Otherwise, Other |
| Other | Highest Confidence Scorer | Highest Confidence Scorer | Other |
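
The decision logic of Table 2 can be sketched as a single function. This is an illustrative Python rendering, not the project's code. It assumes that the cue table has already been reduced to one label and one summed point score per message, and it breaks ties in favor of the cue table; neither detail is specified above. The function and parameter names are hypothetical.

```python
def combine(maxent_label, maxent_conf, cue_label, cue_points, threshold=0.6):
    """Combine the MaxEnt and cue table decisions for one message (Table 2).

    Labels are "urgency", "uncertainty", or "other". maxent_conf is the
    MaxEnt confidence score; cue_points is the summed cue score.
    """
    # Cue scores are divided by five so they can be compared with MaxEnt.
    cue_conf = cue_points / 5.0

    # Diagonal of Table 2: both methods agree.
    if maxent_label == cue_label:
        return maxent_label

    # Right-hand column: the cue table found nothing, so MaxEnt must be
    # confident enough on its own.
    if cue_label == "other":
        return maxent_label if maxent_conf > threshold else "other"

    # Remaining cells: the decision with the highest confidence wins.
    # Assumption: a tie goes to the cue table.
    return maxent_label if maxent_conf > cue_conf else cue_label
```

For instance, combine("uncertainty", 0.55, "other", 0) yields "other" because the MaxEnt score does not exceed the threshold, while combine("urgency", 0.5, "uncertainty", 4) yields "uncertainty" because the scaled cue score of 0.8 is higher. Note that summed cue scores above 5 give a scaled value above 1, which always outranks MaxEnt under this rule.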

4.  RESULTS 


4.1  Information  extraction  metrics 

Recall  and  precision  are  commonly  used  performance  measures  for  tasks  similar  to  this  project.  The  meanings  of  recall 
and  precision  can  be  clarified  by  the  Venn  diagram  of  Figure  1  in  which  the  circle  on  the  left  represents  all  of  the 
information  of  interest  in  the  dataset  (that  is,  the  ground  truth)  and  the  circle  on  the  right  represents  the  information 
selected  by  the  software  analysis.  The  rectangle  represents  the  entire  dataset  (the  Universe). 


Figure  1:  Venn  Diagram.  The  rectangle,  labeled  U,  represents  all  of  the  data  (that  is,  the  Universe).  The  circle  on  the  left  (a  + 
c)  represents  all  of  the  information  of  interest.  The  circle  on  the  right  (a  +  b)  represents  the  information  selected  by  an 
automatic  analysis.  Therefore,  the  intersection  a  represents  the  information  of  interest  that  was  correctly  identified  by 
automatic  analysis. 

A  recall  measure  represents  the  amount  of  correct,  relevant  information  that  was  identified  in  comparison  to  the  total 
amount  of  relevant  information  (that  is,  the  ground  truth)  within  the  dataset.  A  recall  score  of  1.0  would  mean  that  all  of 
the relevant information was correctly identified. It is a measure of the completeness of the data identified. The equation
for  recall,  as  represented  by  the  diagram  of  Figure  1,  is: 

\[ \mathrm{Recall} = \frac{a}{a + c} \qquad (1) \]

A  precision  measure  represents  the  amount  of  information  that  was  correctly  identified  in  comparison  to  the  amount  of 
all  of  the  information  that  was  identified  by  the  analysis.  The  equation  for  precision  is  shown,  below.  A  perfect  precision 
score  of  1.0  means  that  all  of  the  information  selected  as  being  relevant  is  actually  relevant.  Note  that  this  wouldn’t 
necessarily  mean  that  all  of  the  relevant  information  in  the  dataset  has  been  detected. 

\[ \mathrm{Precision} = \frac{a}{a + b} \qquad (2) \]

The  F-measure  is  the  weighted  harmonic  mean  of  precision  and  recall,  useful  for  comparing  capabilities  of  systems  as  a 
single  measure.  For  some  analysis  applications,  one  of  either  recall  or  precision  may  be  more  highly  valued  and  that 
would  determine  the  weight  of  each  of  them  in  the  calculation  of  the  F-measure.  The  research  metric  traditionally  used  is 
the  balanced  F-score,  with  evenly  weighted  recall  and  precision: 

\[ F = \frac{2 \times (\mathrm{precision} \times \mathrm{recall})}{\mathrm{precision} + \mathrm{recall}} \qquad (3) \]
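
Equations 1 through 3 translate directly into code. The short Python sketch below uses the region names a, b, and c from Figure 1; the function names are our own choice for this example. Because the underlying counts are not reported in this paper, the sketch is exercised on the precision and recall values of Table 3 rather than on raw counts.

```python
def recall(a, c):
    """Equation 1: correctly identified items over all items of interest."""
    return a / (a + c)


def precision(a, b):
    """Equation 2: correctly identified items over all items selected."""
    return a / (a + b)


def f_score(p, r):
    """Equation 3: balanced F-score, the harmonic mean of precision and recall."""
    return 2 * (p * r) / (p + r)


# Precision and recall (x 100) as reported in Table 3.
TABLE_3 = {
    "cue table": (46.58, 40.48),
    "cue table without rule 9": (75.00, 32.14),
    "MaxEnt": (39.35, 72.72),
    "parallel": (62.67, 55.95),
}

for method, (p, r) in TABLE_3.items():
    print(f"{method}: F = {f_score(p, r):.2f}")
```

The F-score scales linearly with its inputs, so values already multiplied by 100 can be passed in unchanged. Computed this way, the four F-scores agree with the F-score entries of Table 3 to two decimal places.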

4.2  Analysis  of  Results 

Our  attempts  to  recognize  urgency  were  unsuccessful.  The  cues  we  thought  we  observed  were  vague  to  begin  with  and 
focused  on  keyword  matching.  After-test  review  of  the  data  showed  that  some  overgeneration  was  caused  by  rule  12  of 
Table  1  that  looked  for  a  capitalized  “NO”  as  an  indication  of  urgency,  but  the  rule  matched  numerous  references  to  a 
chat  participant  whose  function  name  included  the  word  NO  within  it.  Improving  the  recognition  of  urgency  would 
require  a  completely  new  look  at  the  problem.  Urgency  might  be  better  recognized  if  the  time  between  chat  entries  and  a 
count  of  misspelled  words  were  used  as  cues. 

The results for recognizing uncertainty using the cue table alone were also disappointing. However, the results of the
parallel algorithm, as well as the improved scores achieved for the cue table by adjusting its rules, were
encouraging and point towards some validation of the project’s direction; the scores achieved in this project were
comparable to scores achieved in very early analyses such as the third Message Understanding Conference of 1991.
Table 3 shows the values for precision, recall and the balanced F-score for analysis of the dataset for each of the analysis
methodologies (rule-based cue analysis, maximum entropy, and parallel analysis) for uncertainty. Precision, recall, and
F-scores are multiplied by 100, as had been the practice of the DARPA-funded Message Understanding Conference
(MUC) evaluations.

The  cue  table  recall  score  was  40.48  and  the  precision  score  was  46.58.  Manual  review  of  the  labeling  indicated  that  a 
large  amount  of  the  overgeneration  by  the  cue  table  rule-based  algorithm  was  due  to  one  particular  rule  -  the  ellipsis  rule 
(rule  #9  in  Table  1).  The  rule  labeled  every  chat  entry  containing  ellipsis  to  be  representative  of  uncertainty.  Rule  #8, 
marking  an  entry  as  uncertain  if  it  contains  a  question  and  an  ellipsis,  was  correct  a  larger  percentage  of  the  time. 
Eliminating  rule  #9  increased  the  precision  significantly  (to  75.00;  as  shown  by  parenthesized  entry  in  Table  3). 
However,  eliminating  that  rule  reduced  the  recall  value  because  some  of  the  recognitions  would  have  been  valid,  but  the 
reduction  in  recall  was  less  significant  than  the  increase  in  precision,  as  shown  by  the  increase  in  the  F-score.  It  may  be 
that  further  investigation  could  refine  a  rule  or  ruleset  for  recognizing  uncertainty  in  chat  messages  containing  ellipsis. 

As  mentioned  before,  rule  #7,  labeling  messages  containing  the  phrase  “Which  <noun>?”  as  demonstrating  uncertainty, 
was  the  only  rule  developed  that  would  require  parsing  or  part-of-speech  tagging.  With  further  investigation,  or  within 
other chat databases, deeper grammatical analysis might produce more and/or stronger cues of uncertainty. It was also
noted  during  this  project  that  phrasing  of  messages  contributed  to  manual  detection  of  uncertainty.  For  example,  the 
question  “Do  you  know  if  we  should  track  this  target?”  conveys  more  uncertainty  than  the  question  “Where  is  target  A?” 
Although  both  are  requests  for  information,  the  tone  of  the  first  question  is  tentative,  whereas  the  tone  of  the  second 
question  is  more  business-like.  However,  this  cue  of  uncertainty  was  not  considered  during  algorithm  development  due 
to  lack  of  time.  This  would  be  something  to  consider  in  future  work. 

As  can  be  noted  from  Table  3,  maximum  entropy  analysis  recall  of  uncertainty  is  significantly  higher  than  that  of  cue 
analysis  -  it  was  able  to  recognize  many  more  of  the  chat  entries  presenting  uncertainty.  Its  precision,  however,  was 
lower  than  that  of  cue  analysis.  Statistical  analyses,  like  MaxEnt,  often  can  be  improved  (to  a  point)  with  additional 
training.  It  would  be  interesting  to  determine  the  amount  of  training  data  that  would  provide  the  best  performance.  It 
should  be  noted  that  MaxEnt,  as  we  applied  it,  is  not  able  to  detect  connotative  meanings  where  the  cues  are  present 
across  messages.  We  used  MaxEnt  as  a  “bag  of  words”  approach  to  message  classification,  meaning  that  MaxEnt  did 
not  take  the  order  of  the  words  within  each  message  into  account;  further  investigation  might  look  into  what  word  order 
and  relationships  can  bring  to  recognizing  connotative  meaning. 

Combining  the  rule-based  algorithm  and  MaxEnt  into  a  parallel  algorithm  was  implemented  upon  realizing  that  MaxEnt 
recall  scores  were  much  better  than  cue  analysis,  and  cue  analysis  precision  scores  were  a  bit  better  than  MaxEnt.  This 
parallel algorithm resulted in an improvement in overall performance: its precision is higher than that of both the cue
table and MaxEnt alone, while its recall is lower than that of MaxEnt but higher than that of the cue table. The reduction
in  recall  (with  respect  to  MaxEnt)  is  not  as  significant  as  the  increase  in  precision  (with  respect  to  the  cue  table  and 
MaxEnt),  as  indicated  by  the  increase  in  F-score  (with  respect  to  both  methodologies).  It  would  be  interesting  to  try 
different  threshold  values  for  MaxEnt  in  the  parallel  algorithm  to  see  how  the  performance  of  the  parallel  algorithm  is 
affected  and  find  the  optimal  threshold  value. 


Table 3: Uncertainty
(Parenthesized entries are results of cue analysis without Rule #9.)

| Method | Recall x 100 | Precision x 100 | F-score x 100 |
|---|---|---|---|
| Cue Table | 40.48 (32.14) | 46.58 (75.00) | 43.32 (45.00) |
| MaxEnt | 72.72 | 39.35 | 51.07 |
| Parallel | 55.95 | 62.67 | 59.12 |

5.  SUMMARY 

Chat,  as  a  more  expressive  medium  than  formal  text,  contains  technical  content,  as  well  as  cues  expressing  emotions  the 
users  may  be  feeling.  In  order  to  exploit  both  of  these  facets  of  chat,  techniques  must  be  developed  to  go  beyond 
understanding  only  the  technical  content  and  recognize  any  connotations  chat  messages  may  express.  This  paper  has 
shown  how  encouraging  results  were  achieved  when  a  combination  of  rule-based  and  statistical  techniques  was  used  to 
recognize  uncertainty  in  military  chat  messages.  As  the  use  of  chat  increases  in  the  military  domain,  further  research  in 
this  area,  as  well  as  others,  is  necessary  to  enable  more  useful  automatic  intelligence  exploitation  of  chat  messages. 